Fuzzy Clustering: Consistency of Entropy Regularization
Authors
Abstract
This paper introduces a new formulation of the regularized fuzzy C-means (FCM) algorithm that automatically finds the actual number of clusters. The approach minimizes an objective function that mixes, via a mixing parameter, a classical FCM term and a new entropy regularizer. The main contribution is a new exponential form of the fuzzy memberships, which ensures the consistency of their bounds and makes it possible to interpret the mixing parameter as the variance (or scale) of the clusters. This variance, closely related to the number of clusters, provides an intuitive and easy-to-set parameter. We discuss the proposed approach from the regularization point of view and demonstrate its validity both analytically and experimentally. We also show an extension of the method to non-linearly separable data. Finally, we illustrate preliminary results on simple toy examples as well as on image segmentation and database categorization problems.
Accepted at the International Conference on Computational Intelligence (Special Session on Fuzzy Clustering), Dortmund, Germany, September 2004.
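To make the idea of exponential memberships concrete, the following is a minimal sketch of the generic, textbook form of entropy-regularized fuzzy c-means, not the paper's own formulation, which the abstract does not spell out. It assumes an objective of the form sum_ik u_ik d_ik^2 + scale * sum_ik u_ik log(u_ik) with memberships summing to one per point, which yields memberships proportional to exp(-d_ik^2 / scale). The names `entropy_fcm`, `update_memberships`, and `scale` are hypothetical, and the sketch keeps the number of clusters fixed rather than finding it automatically.

```python
import math


def update_memberships(points, centers, scale):
    """Exponential (softmax) memberships: u_ik proportional to exp(-d_ik^2 / scale)."""
    memberships = []
    for x in points:
        d2 = [sum((a - b) ** 2 for a, b in zip(x, c)) for c in centers]
        d2_min = min(d2)  # shift exponents for numerical stability
        weights = [math.exp(-(d - d2_min) / scale) for d in d2]
        total = sum(weights)
        memberships.append([w / total for w in weights])
    return memberships


def entropy_fcm(points, centers, scale, n_iter=50):
    """Alternate membership and center updates (assumed generic scheme)."""
    centers = [list(c) for c in centers]
    dim = len(centers[0])
    for _ in range(n_iter):
        memberships = update_memberships(points, centers, scale)
        for i in range(len(centers)):
            mass = sum(u[i] for u in memberships)
            if mass > 0.0:  # leave a center in place if it attracts no mass
                centers[i] = [
                    sum(u[i] * x[j] for u, x in zip(memberships, points)) / mass
                    for j in range(dim)
                ]
    # Recompute memberships so they match the final centers.
    return centers, update_memberships(points, centers, scale)


# Toy data: two well-separated groups of three 2-D points (illustrative only).
points = [(0.0, 0.0), (0.1, 0.2), (0.2, 0.0), (5.0, 5.0), (5.1, 4.9), (4.9, 5.2)]
centers, memberships = entropy_fcm(points, [(0.0, 1.0), (4.0, 4.0)], scale=1.0)
```

In this sketch, a larger `scale` spreads each point's membership more evenly across clusters, while a smaller one approaches hard assignment; this is the sense in which such a parameter behaves like a cluster variance.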
Similar resources
Relative entropy fuzzy c-means clustering
Pattern recognition is a collection of computer techniques for classifying observations into clusters of similar attributes in either a supervised or an unsupervised manner. Application of fuzzy logic to unsupervised classification or clustering methods has resulted in many widely used techniques such as the fuzzy c-means (FCM) method. However, when the observations are too noisy, the perf...
A Framework for Optimal Attribute Evaluation and Selection in Hesitant Fuzzy Environment Based on Enhanced Ordered Weighted Entropy Approach for Medical Dataset
Background: In this paper, a generic hesitant fuzzy set (HFS) model for clustering various ECG beats according to weights of attributes is proposed. A comprehensive review of electrocardiogram signal classification and segmentation methodologies indicates that ECG analysis calls for algorithms that can effectively handle the nonstationarity and uncertainty of the signals. Ex...
Deterministic and Simulated Annealing Approach to Fuzzy C-means Clustering
This paper explains the approximation of a membership function obtained by entropy regularization of the fuzzy c-means (FCM) method. By regularizing FCM with fuzzy entropy, a membership function similar to the Fermi-Dirac distribution function is obtained. We propose a new clustering method, in which the minimum of the Helmholtz free energy for FCM is searched by deterministic annealing (DA), w...
Entropy-based Consensus for Distributed Data Clustering
The increasingly large scale of available data and the more restrictive concerns about their privacy are some of the challenging aspects of data mining today. In this paper, Entropy-based Consensus on Cluster Centers (EC3) is introduced for clustering in distributed systems with a consideration for confidentiality of data; i.e. it is the negotiations among local cluster centers that are used in t...
Modification of the Fast Global K-means Using a Fuzzy Relation with Application in Microarray Data Analysis
Recognizing genes with distinctive expression levels can help in the prevention, diagnosis, and treatment of diseases at the genomic level. In this paper, fast Global k-means (fast GKM) is developed for clustering gene expression datasets. Fast GKM is a significant improvement of the k-means clustering method. It is an incremental clustering method which starts with one cluster. Iteratively ...
Publication date: 2004